Method and Apparatus for a Noise Filter for Reducing Noise in an Image or Video

ABSTRACT

A noise filter method and apparatus for producing at least one of a video or an image with reduced noise. The noise filter method includes performing noise estimation on a frame of at least one of an image or video and applying a low pass filter on the noise level according to the noise estimation, performing spatial filtration on the frame, performing motion detection on a spatially filtered frame, determining motion-to-blending factor conversion and, accordingly, performing frame blending, and outputting a frame with reduced noise.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of U.S. provisional patent application Ser. No. 61/013,682, filed Dec. 14, 2007, which is herein incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

Embodiments of the present invention generally relate to a method and apparatus for a noise filter for reducing noise in a noisy image or video.

2. Description of the Related Art

Video and image noise reduction is an important part of video and image processing on both the input side and the display side of digital consumer electronics. For example, videos captured by digital camcorders, cameras, and video cellular phones under low light and high ISO gain contain a significant amount of noise. Analog video inputs from TV cable and DVD/VCR are also contaminated by transmission noise. The noise not only degrades the video quality, but also hurts the video coding efficiency because the encoder has to spend extra bits to encode the noise.

Therefore, there is a need for a method and/or apparatus for an improved noise filter that reduces noise in a noisy image or video.

SUMMARY OF THE INVENTION

Embodiments of the present invention relate to a noise filter method and apparatus for producing at least one of a video or an image with reduced noise. The noise filter method includes performing noise estimation on a frame of at least one of an image or video and applying a low pass filter on the noise level according to the noise estimation, performing spatial filtration on the frame, performing motion detection on a spatially filtered frame, determining motion-to-blending factor conversion and, accordingly, performing frame blending, and outputting a frame with reduced noise.

BRIEF DESCRIPTION OF THE DRAWINGS

So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.

FIG. 1 is an embodiment of a block diagram of a noise filter utilizing both spatial filtration and temporal filtration;

FIG. 2 is an embodiment of a blending factor controlled by the motion value;

FIG. 3 is an embodiment of an offset α₀ controlled by the total noise level N_(total);

FIG. 4 is a flow diagram depicting an embodiment of a method for filtering noise utilizing both spatial filtration and temporal filtration;

FIG. 5 is a flow diagram depicting an embodiment for generating a spatially filtered frame; and

FIG. 6 is a flow diagram depicting an embodiment for noise estimation.

DETAILED DESCRIPTION

For the purposes of this application, a computer readable medium is any medium that may be accessed by a computer for reading, writing, executing, and the like of data and/or computer instructions.

Described herein is a noise filter for video or images that utilizes both spatial filtration and temporal filtration to effectively reduce the noise in noisy videos or images. The filter adapts to motion and noise level to achieve consistently good results for moving scenes and for videos with a changing noise level. The noise filter improves both visual quality and coding efficiency significantly. Even though this application describes the spatial filtration first, the noise estimation may be performed before the spatial filtration or simultaneously with it.

FIG. 1 is an embodiment of a block diagram of a noise filter 100 utilizing both spatial filtration and temporal filtration. The noise filter includes a noise level estimation 102, a spatial filter 104, a motion detection 106, and a buffer 108.

I(x,y,n) is the input frame 110 and I_(s)(x,y,n) is the output frame 114 of the spatial filter 104.

I _(s)(x,y,n)=F _(s)(I(x,y,n)).

The spatial filter F_(s) of the spatial filter 104 may be applied block-by-block or line-by-line. The spatial filter F_(s) involves three steps, which are discussed below. Note that the steps described may occur in a different order.

First is the creation of a hierarchical representation. Hence, an h×v-level (horizontally h-level, vertically v-level) hierarchical representation is created of each frame by successive high-pass and low-pass filtration. The representation is a set of coefficient arrays in every level.

For k-th level, the high-pass filter and low-pass filter are:

f_(L)=[1, (2^(k−1)−1) zeros, 1], f_(H)=[1, (2^(k−1)−1) zeros, −1].

Without loss of generality, we assume h≧v. Let I₁=I. Starting from level 1, for the levels 1≦k≦v, apply the filters in the following way:

- Filter I_(k) vertically by f_(L) to create vL_(k).
- Filter I_(k) vertically by f_(H) to create vH_(k).
- Filter vL_(k) horizontally by f_(L) to create I_(k+1).
- Filter vL_(k) horizontally by f_(H) to create vLhH_(k).
- Filter vH_(k) horizontally by f_(L) to create vHhL_(k).
- Filter vH_(k) horizontally by f_(H) to create vHhH_(k).

For the levels v<k≦h, apply the filters in the following way:

- Filter I_(k) horizontally by f_(L) to create I_(k+1).
- Filter I_(k) horizontally by f_(H) to create hH_(k).
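
The analysis steps listed above for the levels 1≦k≦v can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the names `_filt` and `analyze_level` are hypothetical, rows are taken as the vertical axis, and wrap-around borders (via `np.roll`) are assumed because the text does not say how frame edges are handled.

```python
import numpy as np

def _filt(a, d, sign, axis):
    """Apply the 2-tap filter [1, (d-1) zeros, sign] along one axis."""
    # np.roll(a, d)[n] == a[n - d]; wrap-around borders are an assumption.
    return a + sign * np.roll(a, d, axis=axis)

def analyze_level(I_k, k):
    """One 2-D analysis level: returns (I_next, vLhH, vHhL, vHhH)."""
    d = 2 ** (k - 1)                     # tap spacing for level k
    I_k = np.asarray(I_k, dtype=float)
    vL = _filt(I_k, d, +1, axis=0)       # vertical low-pass f_L
    vH = _filt(I_k, d, -1, axis=0)       # vertical high-pass f_H
    I_next = _filt(vL, d, +1, axis=1)    # horizontal f_L gives I_(k+1)
    vLhH = _filt(vL, d, -1, axis=1)      # horizontal f_H on vL
    vHhL = _filt(vH, d, +1, axis=1)      # horizontal f_L on vH
    vHhH = _filt(vH, d, -1, axis=1)      # horizontal f_H on vH
    return I_next, vLhH, vHhL, vHhH
```

For a constant frame, the three high-pass arrays are zero and `I_next` is four times the input, which reflects the unnormalized low-pass gain of 2 in each direction.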

For different system complexity constraints, we can choose different h and v to create a spatial filter F_(s) of the spatial filter 104 with a different size. For example, if h and v are both 3, the size of F_(s) is 15×15. If h=3 and v=2, the size of F_(s) is 15×7. If h=2 and v=1, the size of F_(s) is 7×3.
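
The sizes quoted above follow from the tap spacings of the level filters. The short derivation below is not part of the original text; it assumes that the analysis filters and the synthesis filters are simply cascaded. A level-k filter spans 2^(k−1)+1 samples, so each level adds 2^(k−1) samples to the support of the cascade. The analysis cascade and the synthesis cascade each span 2^h samples, and convolving the two gives the extent along one direction:

$\text{size}(h) = \left(1 + \sum_{k=1}^{h} 2^{k-1}\right) + \left(1 + \sum_{k=1}^{h} 2^{k-1}\right) - 1 = 2^{h} + 2^{h} - 1 = 2^{h+1} - 1.$

This gives size(3)=15, size(2)=7, and size(1)=3, which agrees with the 15×15, 15×7, and 7×3 examples.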

Second is the modification of the hierarchical representation. In this step, certain coefficient arrays in the k-th level of the hierarchical representation are modified. For levels 1≦k≦v, vLhH_(k), vHhL_(k), and vHhH_(k) need to be modified. For levels v<k≦h, hH_(k) needs to be modified. For each of these coefficient arrays, all of the elements are modified by using the following mapping function:

$y = \begin{cases} x & \text{for } x \geq T_{k}, \\ 0 & \text{for } x < T_{k}. \end{cases}$

T_(k) is the threshold of the k-th level, which is a scaled version of the noise level N_(f) determined by the noise estimation described below.

T _(k) =T _(0k) N _(f).

T_(0k) is an input strength parameter of the k-th level of the spatial noise filter. A larger T_(0k) produces smoother results. A smaller T_(0k) keeps more details. The spatial noise filter for frame n can use T_(k)(n−1) if T_(k)(n) is not available before the processing of frame n finishes.
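
The coefficient modification can be sketched as follows. This is a minimal illustration, assuming NumPy arrays for the coefficient arrays; the function name `modify_coefficients` is hypothetical. The mapping is coded literally as written above; the text does not state whether the comparison is meant to apply to the coefficient magnitude, so no such variant is assumed here.

```python
import numpy as np

def modify_coefficients(coeffs, T0k, Nf):
    """Apply y = x for x >= T_k, else 0, with T_k = T0k * Nf."""
    T_k = T0k * Nf                       # level threshold scaled by noise level
    coeffs = np.asarray(coeffs, dtype=float)
    # Literal form of the mapping function given in the text.
    return np.where(coeffs >= T_k, coeffs, 0.0)
```

For example, with T_(0k)=2 and N_(f)=1 the threshold is 2, so a coefficient of 0.5 is set to zero while coefficients of 2 and 3 pass through unchanged.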

Third is the creation of a spatially filtered frame 114, in which the modified hierarchical representation is used to create the spatially filtered frame 114. For the k-th level, the high-pass filter and low-pass filter are:

f_(L)=[1, (2^(k−1)−1) zeros, 1], f_(H)=[−1, (2^(k−1)−1) zeros, 1].

Starting from level h, for the levels v<k≦h, the filters are applied in the following way:

- Filter I_(k+1) horizontally by f_(L) to create hLhL_(k).
- Filter hH_(k) horizontally by f_(H) to create hHhH_(k).
- I_(k)=(hLhL_(k)+hHhH_(k))/4.

For the levels 1≦k≦v, apply the filters in the following way:

- Filter I_(k+1) vertically by f_(L) to create vLhLvL_(k).
- Filter vLhLvL_(k) horizontally by f_(L) to create vLhLvLhL_(k).
- Filter vLhH_(k) vertically by f_(L) to create vLhHvL_(k).
- Filter vLhHvL_(k) horizontally by f_(H) to create vLhHvLhH_(k).
- Filter vHhL_(k) vertically by f_(H) to create vHhLvH_(k).
- Filter vHhLvH_(k) horizontally by f_(L) to create vHhLvHhL_(k).
- Filter vHhH_(k) vertically by f_(H) to create vHhHvH_(k).
- Filter vHhHvH_(k) horizontally by f_(H) to create vHhHvHhH_(k).
- I_(k)=(vLhLvLhL_(k)+vLhHvLhH_(k)+vHhLvHhL_(k)+vHhHvHhH_(k))/16.

The spatially filtered frame 114 is I_(s)=I₁. A color frame contains three channels: Y, U, V. The spatial filter is applied on each color channel independently.
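
The relationship between the analysis filters and the synthesis filters can be checked with a one-dimensional sketch, which corresponds to the horizontal-only levels v<k≦h. The names `analyze` and `synthesize` are hypothetical, wrap-around borders are an assumption, and the explicit shift that removes the filter delay is added here only so that the output lines up with the input. With unmodified coefficients the round trip returns the original signal, which shows why each direction is divided by 4.

```python
import numpy as np

def analyze(x, levels):
    """Undecimated 1-D decomposition: returns (final low-pass, high-pass list)."""
    low = np.asarray(x, dtype=float)
    highs = []
    for k in range(1, levels + 1):
        d = 2 ** (k - 1)                 # (d - 1) zeros between the two taps
        shifted = np.roll(low, d)        # shifted[n] = low[n - d], circular border
        highs.append(low - shifted)      # f_H = [1, zeros, -1]
        low = low + shifted              # f_L = [1, zeros, 1]
    return low, highs

def synthesize(low, highs):
    """Rebuild the signal, starting from the highest level."""
    for k in range(len(highs), 0, -1):
        d = 2 ** (k - 1)
        h = highs[k - 1]
        lo = low + np.roll(low, d)       # f_L = [1, zeros, 1]
        hi = -h + np.roll(h, d)          # f_H = [-1, zeros, 1]
        # (lo + hi) / 4 equals the level input delayed by d; undo the delay.
        low = np.roll((lo + hi) / 4.0, -d)
    return low
```

In the noise filter, the high-pass arrays would be thresholded between `analyze` and `synthesize`, so the output is a smoothed version of the input rather than an exact copy.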

In addition to accounting for and applying the spatial filter, the noise filter also estimates the noise. The noise estimation contains three steps, which are described herein below.

First is estimating the noise for each block/line. The frame is processed either block-by-block or line-by-line. So we first estimate a noise level N_(i) for i-th block or line. In one embodiment, one of two methods may be utilized to estimate N_(i). One method is based on spatial information and the other is based on temporal information. They can be chosen based on the application.

In N_(i) estimation based on spatial information, N_(i) is the mean absolute value of the coefficient array given at the first level of the hierarchical representation.

N _(i)=mean(|vHhH _(1i)|).

vHhH_(1i) is the i-th block or line of the coefficient array vHhH₁.

In the N_(i) estimation based on temporal information, N_(i) is the mean absolute difference between the input frame I 110 and a reference frame I_(p) 116.

N _(i)=mean(|I _(i) −I _(pi)|).

I_(i) is the i-th block or line of the input frame I 110. I_(pi) is the i-th block or line of the reference frame I_(p) 116.

Second is estimating noise for a frame. After we have N_(i) for all i, the noise level of the frame is the mean, or the median, or the minimum of N_(i). They can be chosen based on the application.

- N=mean(N_(i)) for all i.
- Or N=median(N_(i)) for all i.
- Or N=min(N_(i)) for all i.

Third, the noise level should change slowly in a video sequence. So a low-pass IIR filter is applied on the noise level. N(n) denotes the noise level of the n-th frame and N_(f)(n) denotes the noise level after the low-pass filtration.

N _(f)(n)=βN _(f)(n−1)+(1−β)N(n).

β is the coefficient of the IIR filter which controls how fast the noise level changes frame-to-frame. The noise estimation is performed on each color channel independently. Each color channel has its own noise level.
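
The three noise estimation steps can be sketched for a single color channel as follows. This is an illustrative sketch only: it uses the temporal method with each row treated as one line, the function names are hypothetical, and the default β of 0.9 is an assumed value that does not come from the text.

```python
import numpy as np

def line_noise_levels(I, I_p):
    """Step 1: N_i = mean(|I_i - I_pi|), with each row treated as one line."""
    diff = np.abs(np.asarray(I, dtype=float) - np.asarray(I_p, dtype=float))
    return diff.mean(axis=1)

def frame_noise_level(N_i, mode="median"):
    """Step 2: combine the per-line levels into one frame level N."""
    reducers = {"mean": np.mean, "median": np.median, "min": np.min}
    return float(reducers[mode](N_i))

def smooth_noise_level(N_f_prev, N, beta=0.9):
    """Step 3: low-pass IIR, N_f(n) = beta * N_f(n-1) + (1 - beta) * N(n)."""
    # beta = 0.9 is an assumed default, not a value given in the text.
    return beta * N_f_prev + (1.0 - beta) * N
```

A β close to 1 makes N_(f) follow the previous estimate closely, so a single noisy frame changes the thresholds only slightly.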

There are three steps for the temporal filtration: motion detection, motion-to-blending factor conversion, and frame blending. The temporal filter can also be applied block-by-block or line-by-line.

In the motion detection step, the reference frame I_(p)(x,y,n) 116 is the previous output frame stored in the buffer 108.

I _(p)(x,y,n)=I _(o)(x,y,n−1).

The motion value at (x, y) is the sum, over all three color channels, of the absolute difference between the spatially filtered frame 114 and the reference frame 116:

m(x,y,n)=|I_(s_Y)(x,y,n)−I_(p_Y)(x,y,n)|+|I_(s_U)(x,y,n)−I_(p_U)(x,y,n)|+|I_(s_V)(x,y,n)−I_(p_V)(x,y,n)|.

I_(s_Y), I_(s_U), and I_(s_V) are the three color channels of I_(s) 114. I_(p_Y), I_(p_U), and I_(p_V) are the three color channels of I_(p) 116.

Since the motion detection works on the spatially filtered frame I_(s) 114 and the previously filtered frame I_(p) 116, it is much more robust than motion detection that works on the original noisy frames.
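
The motion value can be computed per pixel as sketched below. This is a minimal sketch that assumes both frames are stored as arrays of shape (height, width, 3) holding the Y, U, and V channels; the function name `motion_value` is hypothetical.

```python
import numpy as np

def motion_value(I_s, I_p):
    """Per-pixel motion: sum over Y, U, V of |I_s - I_p|.

    Both inputs are assumed to have shape (height, width, 3).
    """
    # Convert to float first so that unsigned pixel types cannot wrap around.
    I_s = np.asarray(I_s, dtype=float)
    I_p = np.asarray(I_p, dtype=float)
    return np.abs(I_s - I_p).sum(axis=-1)
```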

In the motion-to-blending factor conversion step, if there is little motion, the temporal filtration result is more reliable. If there is large motion, the spatial filtration result is more reliable. FIG. 2 is an embodiment of a blending factor controlled by the motion value. As shown in FIG. 2, a blending factor for each pixel at x, y is defined as:

$\alpha(x,y,n) = \begin{cases} \alpha_{0} + (1 - \alpha_{0})\, m(x,y,n)/T_{m} & \text{if } m(x,y,n) < T_{m}, \\ 1 & \text{else.} \end{cases}$

T_(m) is an input parameter of the temporal filter. Flat areas look smoother when T_(m) increases. But larger T_(m) causes more “ghosting” artifacts on moving areas. α₀ is the offset of the motion-blending factor function in FIG. 2.

FIG. 3 is an embodiment of an offset α₀ controlled by the total noise level N_(total). As shown in FIG. 3, the offset is controlled by the total noise level of the three color channels:

$\alpha_{0} = \begin{cases} 1 - N_{total}/T_{\alpha 0} & \text{if } N_{total} < T_{\alpha 0}, \\ 0 & \text{else.} \end{cases}$

N_(total) is the total noise level of all the three channels:

N_(total)=N_(f_Y)+N_(f_U)+N_(f_V).

T_(α0) is a register that controls the slope of the function in FIG. 3. This function drives α₀ close to 1 when the noise level is low, which makes the temporal filter very weak and thereby avoids ghosting artifacts.
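
The two piecewise functions of FIG. 2 and FIG. 3 can be sketched as follows. The function names are hypothetical, and the sketch assumes that T_(m) and T_(α0) are positive.

```python
import numpy as np

def offset_alpha0(N_total, T_alpha0):
    """alpha_0 = 1 - N_total / T_alpha0 if N_total < T_alpha0, else 0."""
    return 1.0 - N_total / T_alpha0 if N_total < T_alpha0 else 0.0

def blending_factor(m, alpha0, T_m):
    """alpha = alpha0 + (1 - alpha0) * m / T_m where m < T_m, else 1."""
    m = np.asarray(m, dtype=float)
    ramp = alpha0 + (1.0 - alpha0) * m / T_m   # linear ramp from alpha0 up to 1
    return np.where(m < T_m, ramp, 1.0)
```

With N_(total)=5 and T_(α0)=10 the offset is 0.5, and with T_(m)=10 a motion value of 5 then gives α=0.75, while any motion value of 10 or more gives α=1, so that only the spatially filtered frame is used.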

In the frame blending step, the output frame 112 is a weighted average of I_(s)(x,y,n) 114 and I_(p)(x,y,n) 116:

$\begin{aligned} I_{o}(x,y,n) &= \alpha\, I_{s}(x,y,n) + (1 - \alpha)\, I_{o}(x,y,n-1) \\ &= \alpha\, I_{s}(x,y,n) + (1 - \alpha)\, I_{p}(x,y,n). \end{aligned}$
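
The frame blending can be sketched as follows. This is a minimal sketch that assumes frames of shape (height, width, 3) and a per-pixel blending factor of shape (height, width) shared by the three color channels; the function name `blend_frames` is hypothetical.

```python
import numpy as np

def blend_frames(I_s, I_p, alpha):
    """I_o = alpha * I_s + (1 - alpha) * I_p, evaluated per pixel."""
    I_s = np.asarray(I_s, dtype=float)
    I_p = np.asarray(I_p, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if alpha.ndim == I_s.ndim - 1:
        alpha = alpha[..., np.newaxis]   # share one alpha across Y, U, V
    return alpha * I_s + (1.0 - alpha) * I_p
```

The returned frame would also be written to the buffer 108 so that it serves as the reference frame I_(p) for the next input frame.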

The spatial filter may or may not be the same as the image filter used. In one embodiment, the horizontal level and vertical level (h and v) may be different. The image filter used may only handle the case when h=v.

FIG. 4 is a flow diagram depicting an embodiment of a noise filtering method 400 utilizing both spatial filtration and temporal filtration. The method starts at step 402 and proceeds to step 404. At step 404, a new frame is received. At step 406, the method 400 performs noise estimation, which is discussed in more detail with respect to FIG. 1 and FIG. 6. At step 408, the method 400 performs spatial filtration, which is described in more detail with respect to FIG. 1 and FIG. 5. At step 410, the method 400 performs motion detection, as described with respect to FIG. 1. At step 412, the method 400 performs motion-to-blending factor conversion, as described with respect to FIG. 1, FIG. 2 and FIG. 3. At step 414, the method 400 outputs a filtered frame. At step 418, the method 400 determines whether the frame processed is the last frame. If the frame is not the last frame, the method 400 proceeds from step 418 to step 404. If the frame is the last frame, the method 400 ends at step 420.

FIG. 5 is a flow diagram depicting an embodiment of a method 500 for generating a spatially filtered frame. The method starts at step 502 and proceeds to step 504. At step 504, the method receives a new frame. At step 506, the method creates a hierarchical representation. At step 508, coefficients in the k-th level of the created hierarchical representation are modified. At step 510, the method 500 creates a spatially filtered frame. At step 512, the spatially filtered frame is outputted. The method 500 ends at step 514.

FIG. 6 is a flow diagram depicting an embodiment of a method 600 for noise estimation. The method 600 starts at step 602. At step 604, a new frame is received. At step 606, the method 600 calculates noise level of one or more blocks and/or lines. At step 608, the method 600 calculates the noise level of the frame. At step 610, the method 600 applies a low pass filter on the noise level. At step 612, a noise level is outputted. The method 600 ends at step 614.

While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. 

1. A noise filter method for producing at least one of a video or an image with reduced noise, comprising: performing noise estimation on a frame of at least one of an image or video and applying a low pass filter on the noise level; performing spatial filtration on the frame according to the noise estimation; performing motion detection on a spatially filtered frame; determining motion-to-blending factor conversion according to the noise estimation and, accordingly, performing frame blending; and outputting a frame with reduced noise.
 2. The noise filter method of claim 1, wherein the step of performing spatial filtration comprises: creating a hierarchical representation of the frame; modifying a coefficient in k-th level of the created hierarchical representation according to the noise estimation; and producing the spatially filtered frame.
 3. The noise filter method of claim 1, wherein the step of estimating noise comprises: calculating noise level of at least a block or a line; calculating noise level of the frame; and applying a low pass filter on the noise level.
 4. A computer readable medium comprising computer instructions, which when executed perform a noise filter method for producing at least one of a video or an image with reduced noise, the method comprising: performing noise estimation on a frame of at least one of an image or video and applying a low pass filter on the noise level; performing spatial filtration on the frame according to the noise estimation; performing motion detection on a spatially filtered frame; determining motion-to-blending factor conversion according to the noise estimation and, accordingly, performing frame blending; and outputting a frame with reduced noise.
 5. The computer readable medium of claim 4, wherein the step of performing spatial filtration of the noise filter method comprises: creating a hierarchical representation of the frame; modifying a coefficient in k-th level of the created hierarchical representation according to the noise estimation; and producing the spatially filtered frame.
 6. The computer readable medium of claim 4, wherein the step of estimating noise of the noise filter method comprises: calculating noise level of at least a block or a line; calculating noise level of the frame; and applying a low pass filter on the noise level.
 7. An apparatus, comprising: means for performing noise estimation on a frame of at least one of an image or video and applying a low pass filter on the noise level; means for performing spatial filtration on the frame according to the noise estimation; means for performing motion detection on a spatially filtered frame; and means for determining motion-to-blending factor conversion according to the noise estimation and, accordingly, performing frame blending.
 8. The apparatus of claim 7, wherein the means for performing spatial filtration comprises: means for creating a hierarchical representation of the frame; means for modifying a coefficient in k-th level of the created hierarchical representation according to the noise estimation; and means for producing the spatially filtered frame.
 9. The apparatus of claim 7, wherein the means for estimating noise comprises: means for calculating noise level of at least a block or a line; means for calculating noise level of the frame; and applying a low pass filter on the noise level. 